
    Large quadratic programs in training Gaussian support vector machines

    We consider the numerical solution of the large convex quadratic program arising in training the learning machines known as support vector machines. Since the matrix of the quadratic form is dense and generally large, solution approaches based on explicit storage of this matrix are not practicable. Well-known strategies for this quadratic program are based on decomposition techniques that split the problem into a sequence of smaller quadratic programming subproblems. For the solution of these subproblems we present an iterative projection-type method suited to the structure of the constraints and very effective in the case of Gaussian support vector machines. We develop an appropriate decomposition technique designed to exploit the high performance of the proposed inner solver on medium or large subproblems. Numerical experiments on large-scale benchmark problems allow us to compare this approach with another widely used decomposition technique. Finally, a parallel extension of the proposed strategy is described.
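    To make the decomposition idea concrete, the sketch below uses the smallest possible working set, a single pair of dual variables, as in the well-known SMO scheme; the paper instead solves medium or large subproblems with an iterative projection-type inner solver. This is an illustrative Python sketch for small dense problems, not the authors' code, and names such as smo_train are hypothetical.

    import numpy as np

    def gaussian_gram(X, gamma=0.5):
        # dense Gaussian kernel matrix; only viable for small problems
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def smo_train(X, y, C=1.0, gamma=0.5, tol=1e-3, max_passes=5):
        # dual QP: max sum(a) - 0.5*a'Qa with Q = yy' * K, 0 <= a <= C, y'a = 0
        n = len(y)
        K = gaussian_gram(X, gamma)
        a, b, passes = np.zeros(n), 0.0, 0
        while passes < max_passes:
            changed = 0
            for i in range(n):
                Ei = (a * y) @ K[:, i] + b - y[i]
                if (y[i] * Ei < -tol and a[i] < C) or (y[i] * Ei > tol and a[i] > 0):
                    j = np.random.choice([k for k in range(n) if k != i])
                    Ej = (a * y) @ K[:, j] + b - y[j]
                    ai_old, aj_old = a[i], a[j]
                    if y[i] != y[j]:
                        L, H = max(0, a[j] - a[i]), min(C, C + a[j] - a[i])
                    else:
                        L, H = max(0, a[i] + a[j] - C), min(C, a[i] + a[j])
                    eta = K[i, i] + K[j, j] - 2 * K[i, j]
                    if L == H or eta <= 0:
                        continue
                    a[j] = np.clip(aj_old + y[j] * (Ei - Ej) / eta, L, H)
                    if abs(a[j] - aj_old) < 1e-8:
                        continue
                    a[i] = ai_old + y[i] * y[j] * (aj_old - a[j])
                    # keep the bias consistent with the updated pair
                    b1 = b - Ei - y[i] * (a[i] - ai_old) * K[i, i] - y[j] * (a[j] - aj_old) * K[i, j]
                    b2 = b - Ej - y[i] * (a[i] - ai_old) * K[i, j] - y[j] * (a[j] - aj_old) * K[j, j]
                    b = b1 if 0 < a[i] < C else (b2 if 0 < a[j] < C else (b1 + b2) / 2)
                    changed += 1
            passes = passes + 1 if changed == 0 else 0
        return a, b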

    Shearlet-based regularized reconstruction in region-of-interest computed tomography

    Region of interest (ROI) tomography has gained increasing attention in recent years due to its potential to reduce radiation exposure and shorten the scanning time. However, tomographic reconstruction from ROI-focused illumination involves truncated projection data and typically results in higher numerical instability even when the reconstruction problem has a unique solution. To address this problem, both ad hoc analytic formulas and iterative numerical schemes have been proposed in the literature. In this paper, we introduce a novel approach for ROI tomographic reconstruction, formulated as a convex optimization problem with a regularization term based on shearlets. Our numerical implementation consists of an iterative scheme based on the scaled gradient projection method and is tested in the context of fan-beam CT. Our results show that our approach is essentially insensitive to the location of the ROI and remains very stable even when the ROI size is rather small.
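    To illustrate the scaled gradient projection (SGP) scheme, the sketch below applies it to a toy nonnegatively constrained least-squares model in which a smooth quadratic penalty stands in for the paper's shearlet-based regularizer; the diagonal scaling, feasible-direction projection and Armijo backtracking are the generic SGP ingredients, and all names here are illustrative.

    import numpy as np

    def sgp(A, b, lam=0.1, iters=200, eps=1e-10):
        # min_{x >= 0} 0.5*||Ax - b||^2 + 0.5*lam*||x||^2
        x = np.ones(A.shape[1])
        for _ in range(iters):
            g = A.T @ (A @ x - b) + lam * x        # gradient of the objective
            D = np.clip(x, eps, 1e10)              # diagonal scaling matrix
            d = np.maximum(x - D * g, 0.0) - x     # scaled, projected direction
            f = 0.5 * np.linalg.norm(A @ x - b) ** 2 + 0.5 * lam * x @ x
            t = 1.0                                # Armijo backtracking on t
            while True:
                xn = x + t * d
                fn = 0.5 * np.linalg.norm(A @ xn - b) ** 2 + 0.5 * lam * xn @ xn
                if fn <= f + 1e-4 * t * (g @ d) or t < 1e-8:
                    break
                t *= 0.5
            x = xn
        return x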

    Improving the angular resolution of coded aperture instruments using a modified Lucy-Richardson algorithm for deconvolution

    A problem with coded-mask telescopes is the achievable angular resolution. For example, with the standard cross-correlation (CC) analysis, the INTEGRAL IBIS/ISGRI angular resolution is about 13'. We are currently investigating an iterative Lucy-Richardson (LR) algorithm. The LR algorithm can be used effectively when the PSF is known but little or no information is available on the noise. This algorithm maximizes the probability of the restored image under the assumption that the noise is Poisson distributed, which is appropriate for photon noise in the data, and it converges to the maximum likelihood solution. We have modified the classical LR algorithm by adding non-negativity constraints. The algorithm does not take into account the features that make the PSF depend on the position in the field of view (dead pixels, gaps between modules, etc.), which are easily corrected for in the classical CC analysis, so we must correct for these either after the restoration of the image or by modifying the data before the sky reconstruction. We present some results using real IBIS data that indicate the power of the proposed reconstruction algorithm.
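    For reference, the classical Richardson-Lucy multiplicative update that the modified algorithm builds on can be sketched as follows; this is a generic implementation, not the IBIS code, and it assumes a spatially invariant PSF, which is exactly the simplification the abstract says must be corrected for separately.

    import numpy as np
    from scipy.signal import fftconvolve

    def richardson_lucy(image, psf, iters=30, eps=1e-12):
        image = image.astype(float)
        x = np.full_like(image, image.mean())   # flat nonnegative start
        psf_T = psf[::-1, ::-1]                 # adjoint of convolution: flipped PSF
        for _ in range(iters):
            blurred = fftconvolve(x, psf, mode='same')
            ratio = image / np.maximum(blurred, eps)
            x *= fftconvolve(ratio, psf_T, mode='same')   # multiplicative, keeps x >= 0
        return x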

    A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications

    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents RASMO, a resource aware MapReduce based parallel SVM algorithm for large-scale image classification, which partitions the training dataset into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments. The results show that the parallel SVM algorithm reduces the training time significantly compared with the sequential SMO algorithm while maintaining a high level of accuracy in classifications. Work supported by the National Basic Research Program (973) of China under Grant 2014CB34040.
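    The partition-train-merge pattern underlying a MapReduce SVM can be sketched on a single machine as below; scikit-learn's SVC stands in for the SMO trainer, multiprocessing.Pool for the MapReduce layer, and the merge rule (pool the sub-SVMs' support vectors and retrain once) is an illustrative assumption, not RASMO's actual reducer or its genetic load balancing.

    import numpy as np
    from multiprocessing import Pool
    from sklearn.svm import SVC

    def train_partition(part):
        Xp, yp = part
        return SVC(kernel='rbf').fit(Xp, yp)   # "map": one sub-SVM per split

    def parallel_svm(X, y, n_parts=4):
        parts = list(zip(np.array_split(X, n_parts), np.array_split(y, n_parts)))
        with Pool(n_parts) as pool:
            models = pool.map(train_partition, parts)
        # "reduce": pool the support vectors of all sub-SVMs and retrain once
        sv_X = np.vstack([m.support_vectors_ for m in models])
        sv_y = np.concatenate([yp[m.support_] for m, (_, yp) in zip(models, parts)])
        return SVC(kernel='rbf').fit(sv_X, sv_y)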

    Parallel computational experience and dynamic scaling for a class of nonlinear systems

    This work deals with computational and theoretical aspects of a particular class of algebraic nonlinear systems, known as block bordered. Domain decomposition techniques applied to the modeling of many real-world problems frequently give rise to nonlinear systems of this kind, and iterative, Newton-like approaches to their solution are widely used. The PICRN algorithm of Feng and Schnabel is an implicit, two-stage quasi-Newton method that takes advantage of the structure of the Jacobian system and is very well parallelizable on distributed-memory multiprocessor architectures. Here we give the results of an experiment carried out on a nonlinear, particularly ill-conditioned system coming from VLSI simulation, performed on a Cray T3D using Cray MPP Fortran in a virtual shared-memory environment. This numerical experience allows us to study the algorithm's behaviour and motivates the dynamic scaling subsequently proposed and analyzed.
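    The structural advantage of block bordered systems is easiest to see in the linear algebra of a single Newton step: the bordered Jacobian can be solved with independent per-block factorizations followed by a small Schur-complement system on the border variables. A generic sketch (the matrix names below are ours, not Feng and Schnabel's notation):

    import numpy as np

    def solve_block_bordered(A, B, C, D, f, g):
        # solves [[diag(A_i), B_i], [C_i, D]] [x; y] = [f; g], with A a list of
        # diagonal blocks, B/C the border couplings, D the corner block
        Ainv_B = [np.linalg.solve(Ai, Bi) for Ai, Bi in zip(A, B)]   # independent, parallelizable
        Ainv_f = [np.linalg.solve(Ai, fi) for Ai, fi in zip(A, f)]   # independent, parallelizable
        S = D - sum(Ci @ AiB for Ci, AiB in zip(C, Ainv_B))          # small Schur complement
        y = np.linalg.solve(S, g - sum(Ci @ Aif for Ci, Aif in zip(C, Ainv_f)))
        x = [Aif - AiB @ y for Aif, AiB in zip(Ainv_f, Ainv_B)]      # per-block back-substitution
        return x, y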

    Coordination of the research project for the acquisition and management of a latest-generation hybrid CPU/GPU server for scientific computing

    Coordination of the project for the acquisition and management of state-of-the-art equipment for scientific computing. The equipment consists of a multiprocessor host with two six-core 3.4 GHz XEON CPUs and four NVIDIA Fermi GPU boards, each with 448 cores at 64 bits. The shared central RAM is 198 GB of DDR3 ECC and the mass storage is 6 TB on SATA/II hard disks. The total cost of the architecture is 25560 euro. The total funding obtained from the Università di Ferrara is 16760 euro, with 8800 euro co-funded by the proposing researchers. The acquisition falls within the scope of Applied Mathematics and Scientific Computing for the solution of complex applied problems arising in numerous research fields. The overall scientific usage project involves University researchers from many areas: Numerical Analysis, Optimization, Electron Microscopy, Engineering, Operations Research, Machine Learning, Medical Physics, Bioinformatics, Genomics, Geophysics, Physics and Molecular Biology.

    Parallel Joint Inversion

    This ISCRA class C project has two main aims: (1) to complete both the development and the testing of an innovative approach to the problem of reconstructing observed objects by jointly inverting noninvasive, differently acquired observational data sets; (2) to apply the proposed new approach, as well as recent state-of-the-art methods, to large-scale 3D geophysical data and possibly to electron microscopy data. Inverse problems form an extremely active research field, which has received increased attention in recent years due to both the huge amount of data provided by new technologies and the impressive development of effective methods. We are interested in an innovative approach to complex, large-scale volumetric imaging applications in Geophysics, but it can also be very useful in a number of other fields such as Medicine, Astronomy and Microscopy. The idea is that by jointly using more than one data set in the inversion process, one can obtain better results than those obtainable by separately processing each single data set and then "merging" the outcomes. This situation is very common in many different fields, where multiple sets of observations of the same physical system are acquired by different techniques. In the usual approach, each data set is used separately to recover an estimate of the system at hand, and only afterwards are these estimates assembled together, most often on the basis of geometrical/physical considerations by experts in the field. Current theoretical knowledge and algorithmic advances on the numerical optimization side make it possible to face the harder difficulties associated with the solution of the joint inversion problem. Following Tikhonov's approach, we minimize a functional given by a misfit and a regularization term for each of the object's properties to be reconstructed and, differently from the classical approach, a "joining" term which depends on all the investigated properties and embeds the additional requirement we want to impose on the solution. A large amount of data can thus be treated at the same time, coming from models obtained by different discretization techniques. We developed the current version of the code based on the PETSc and TAO parallel libraries, using C and MPI. To face real-world applications, the power of modern multiprocessor architectures is strongly needed. In addition to the described benefits, a further outcome of this project will be to make it possible to face the difficult task of studying and testing techniques for the optimal choice of the various parameters involved, which most often requires solving the same class of problems many times. The project is connected with Ambra Giovannini's Ph.D. work. PROJECT'S BUDGET: 15000 core hours. (CINECA's national ISCRA projects are issued as part of the European PRACE Tier0 access initiative.)
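    A minimal sketch of the kind of functional described above: one misfit and one Tikhonov regularization term per property, plus a "joining" term coupling the reconstructions. The quadratic gradient-matching coupling below is an illustrative choice, assuming both properties are discretized on the same 1D grid; the project's actual joining term may differ.

    import numpy as np

    def joint_objective(m1, m2, A1, A2, d1, d2, lam1, lam2, mu):
        misfit1 = 0.5 * np.linalg.norm(A1 @ m1 - d1) ** 2
        misfit2 = 0.5 * np.linalg.norm(A2 @ m2 - d2) ** 2
        reg1 = 0.5 * lam1 * np.linalg.norm(np.diff(m1)) ** 2    # smoothness prior on m1
        reg2 = 0.5 * lam2 * np.linalg.norm(np.diff(m2)) ** 2    # smoothness prior on m2
        joining = 0.5 * mu * np.linalg.norm(np.diff(m1) - np.diff(m2)) ** 2  # structural coupling
        return misfit1 + misfit2 + reg1 + reg2 + joining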

    Large-scale parallel computing for inverse problems in imaging

    The project aims to address the development of parallel codes for the solution of large- and huge-scale inverse problems in imaging. The starting points are the codes developed and analyzed in previous ISCRA class C projects ("ParJoInv" and "PANOIP"), which in different ways showed that the challenging inversion problems currently arising in a number of application fields, such as Geophysics, Medicine, Astronomy and Microscopy, can be faced effectively in reasonable time only with the support of parallel codes for HPC architectures. These problems usually belong to the class of simply constrained nonlinear programming (NLP) problems and require the minimization of (usually heavy) nonlinear functionals. The functionals are generally composed of one (or more) best-fit terms and one (or more) regularization terms. The size of the resulting NLPs very quickly exceeds tens of millions of variables. We will consider first-order iterative methods for the solution, particularly the scaled gradient projection (SGP) approach. This project aims to take the research codes developed so far a step forward, both by optimizing the code's structure and by introducing hybrid programming, that is, by mixing inter-node distributed-memory computations (the MPI-related part) with intra-node shared-memory multithreaded computations (the OpenMP-related part). Here the computing nodes are assumed to be multicore CPUs. This is a higher level of parallelization, which in recent years has proved very effective in capturing the benefits of both programming paradigms. It will be part of the project's goals to verify whether and how this hybrid approach is compatible with the PETSc and TAO libraries. An additional goal is the integration of GPUs and multicore CPUs: these configurations are increasingly appealing for their reduced costs and good performance. The implementation of the SGP approach within such a mixed environment is surely possible, but has a number of nontrivial issues to face. The project's potential outcomes are significant: for instance, we expect to face huge-scale 3D Microscopy imaging problems provided by next-generation devices. We also plan to compute solutions of 3D Geophysics reconstructions of very large volumes. PROJECT'S BUDGET: 50000 core hours (the maximum allowed). (CINECA's national ISCRA projects are issued as part of the European PRACE Tier0 access initiative.)
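    The hybrid pattern can be sketched as follows, assuming mpi4py is available: MPI splits the unknowns across ranks (the inter-node layer), while the dense per-rank kernels would run multithreaded inside each node; in this Python stand-in, NumPy/BLAS threads play the role of the OpenMP layer of the C code, and all names are illustrative.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n = 1_000_000                      # global number of variables
    local_n = n // size                # each rank owns one contiguous slice
    x_loc = np.zeros(local_n)
    g_loc = np.random.rand(local_n)    # stand-in for the local gradient block

    # one projected-gradient step on the local slice (nonnegativity constraint)
    x_loc = np.maximum(x_loc - 0.1 * g_loc, 0.0)

    # global quantities (objective value, step length, stopping test) need a reduction
    sq = comm.allreduce(float(g_loc @ g_loc), op=MPI.SUM)
    if rank == 0:
        print("global gradient norm:", np.sqrt(sq))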

    Some theoretical properties of Feng-Schnabel algorithm for block bordered nonlinear systems

    Large and sparse nonlinear systems arise in many areas of science and technology, very often as a core process in the model of a real-world problem. Newton-like approaches to their solution imply the computation of a (possibly approximated) Jacobian: in the case of block bordered systems this results in a matrix with disjoint square blocks on the main diagonal, plus a final set of rows and columns. This sparsity class allows the development of multistage Newton-like methods (with inner and outer iterations) that are very well suited to parallel implementation on multiprocessor computers. Recently, Feng and Schnabel proposed an algorithm which is currently the state of the art in this field. In this paper we analyze in depth important theoretical properties of the steps generated by the Feng-Schnabel algorithm. We then study a cheap modification that improves the direction properties, allowing a global convergence result, as well as the extension of the convergence to a broader class of algorithms in which different linesearch globalization rules can be applied.
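    The linesearch globalization discussed above can be illustrated with the generic damped Newton iteration under an Armijo rule on the merit function 0.5*||F(x)||^2; this is the textbook scheme, not the Feng-Schnabel two-stage method itself.

    import numpy as np

    def damped_newton(F, J, x, tol=1e-8, max_iter=50):
        for _ in range(max_iter):
            Fx = F(x)
            fx = 0.5 * Fx @ Fx                 # merit function value
            if np.sqrt(2 * fx) < tol:
                break
            d = np.linalg.solve(J(x), -Fx)     # Newton direction
            t = 1.0
            # for this d, grad(f)'d = -||F||^2, so with c = 1e-4 the Armijo
            # test reads f(x + t*d) <= (1 - 2*c*t) * f(x)
            while 0.5 * np.linalg.norm(F(x + t * d)) ** 2 > (1 - 2e-4 * t) * fx and t > 1e-12:
                t *= 0.5                       # backtracking
            x = x + t * d
        return x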